{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 通过pyODPS访问MaxCompute的例子\n",
    "\n",
    "MaxCompute原名ODPS，所以为了向前兼容，所以MaxCompute的python的包叫做odps（pyodps），这个例子只是简单example，详细的odps的python sdk的文档请参见https://pyodps.readthedocs.io/zh_CN/latest/index.html\n",
    "\n",
    "我们确保有pyodps这个包，你可以在terminal用`pip list`看看是不是有pyodps的包\n",
    "![pip](./odpspip.jpg)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "然后我们import `odps` "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 101,
   "metadata": {},
   "outputs": [],
   "source": [
    "from odps import ODPS\n",
    "from odps.df import DataFrame"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "通过endpoint和AK进行连接，这里我们访问MaxCompute的公开数据级，关于这个数据集的情况，大家可以参考https://yq.aliyun.com/articles/89763 ，同时也可以通过数加体验馆（https://data.aliyun.com/experience ）来使用这些数据，近距离的感受阿里云其他大数据产品。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 113,
   "metadata": {},
   "outputs": [],
   "source": [
    "access_id = '<your access id>'\n",
    "access_key = '<your access key>'\n",
    "project = 'public_data'\n",
    "endpoint = 'http://service.cn.maxcompute.aliyun.com/api'\n",
    "o = ODPS(access_id, access_key, project, endpoint)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "我们可以通过`list_tables` 来查看项目中有多少表以ods_enterprise_share开头"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 116,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "ods_enterprise_share_basic\n",
      "ods_enterprise_share_quarter_cashflow\n",
      "ods_enterprise_share_quarter_growth\n",
      "ods_enterprise_share_quarter_operation\n",
      "ods_enterprise_share_quarter_profit\n",
      "ods_enterprise_share_quarter_report\n",
      "ods_enterprise_share_trade_h\n"
     ]
    }
   ],
   "source": [
    "tables = o.list_tables(prefix='ods_enterprise_share')\n",
    "for table in tables:\n",
    "    print table.name"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "通过`get_table` 来获取表以及它的schema"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 117,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "odps.Schema {\n",
       "  code                            string      # 代码\n",
       "  name                            string      # 名称\n",
       "  industry                        string      # 所属行业\n",
       "  area                            string      # 地区\n",
       "  pe                              string      # 市盈率\n",
       "  outstanding                     string      # 流通股本\n",
       "  totals                          string      # 总股本(万)\n",
       "  totalassets                     string      # 总资产(万)\n",
       "  liquidassets                    string      # 流动资产\n",
       "  fixedassets                     string      # 固定资产\n",
       "  reserved                        string      # 公积金\n",
       "  reservedpershare                string      # 每股公积金\n",
       "  eps                             string      # 每股收益\n",
       "  bvps                            string      # 每股净资\n",
       "  pb                              string      # 市净率\n",
       "  timetomarket                    string      # 上市日期\n",
       "  undp                            string      # 未分利润\n",
       "  perundp                         string      # 每股未分配\n",
       "  rev                             string      # 收入同比(%)\n",
       "  profit                          string      # 利润同比(%)\n",
       "  gpr                             string      # 毛利率(%)\n",
       "  npr                             string      # 净利润率(%)\n",
       "  holders_num                     string      # 股东人数\n",
       "}\n",
       "Partitions {\n",
       "  ds                              string      \n",
       "}"
      ]
     },
     "execution_count": 117,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "t = o.get_table('ods_enterprise_share_basic')\n",
    "t.schema"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 119,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "True\n"
     ]
    }
   ],
   "source": [
    "print t.exist_partition('ds=20170113')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "然后你就可以通过以下接口来访问MaxCompute中的数据"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 122,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "odps.Record {\n",
      "  code                            u'code'\n",
      "  name                            u'name'\n",
      "  industry                        u'industry'\n",
      "  area                            u'area'\n",
      "  pe                              u'pe'\n",
      "  outstanding                     u'outstanding'\n",
      "  totals                          u'totals'\n",
      "  totalassets                     u'totalAssets'\n",
      "  liquidassets                    u'liquidAssets'\n",
      "  fixedassets                     u'fixedAssets'\n",
      "  reserved                        u'reserved'\n",
      "  reservedpershare                u'reservedPerShare'\n",
      "  eps                             u'esp'\n",
      "  bvps                            u'bvps'\n",
      "  pb                              u'pb'\n",
      "  timetomarket                    u'timeToMarket'\n",
      "  undp                            u'undp'\n",
      "  perundp                         u'perundp'\n",
      "  rev                             u'rev'\n",
      "  profit                          u'profit'\n",
      "  gpr                             u'gpr'\n",
      "  npr                             u'npr'\n",
      "  holders_num                     u'holders'\n",
      "  ds                              None\n",
      "}\n",
      "odps.Record {\n",
      "  code                            u'300584'\n",
      "  name                            u'N\\u6d77\\u8fb0'\n",
      "  industry                        u'\\u5316\\u5b66\\u5236\\u836f'\n",
      "  area                            u'\\u6c5f\\u82cf'\n",
      "  pe                              u'30.77'\n",
      "  outstanding                     u'0.2'\n",
      "  totals                          u'0.8'\n",
      "  totalassets                     u'34061.55'\n",
      "  liquidassets                    u'9754.3'\n",
      "  fixedassets                     u'15028.7'\n",
      "  reserved                        u'7264.67'\n",
      "  reservedpershare                u'0.91'\n",
      "  eps                             u'0.39'\n",
      "  bvps                            u'4.37'\n",
      "  pb                              u'3.66'\n",
      "  timetomarket                    u'20170112'\n",
      "  undp                            u'11848.53'\n",
      "  perundp                         u'1.48'\n",
      "  rev                             u'0.0'\n",
      "  profit                          u'0.0'\n",
      "  gpr                             u'66.38'\n",
      "  npr                             u'16.06'\n",
      "  holders_num                     u'36511'\n",
      "  ds                              None\n",
      "}\n",
      "odps.Record {\n",
      "  code                            u'603639'\n",
      "  name                            u'N\\u6d77\\u5229\\u5c14'\n",
      "  industry                        u'\\u519c\\u836f\\u5316\\u80a5'\n",
      "  area                            u'\\u5c71\\u4e1c'\n",
      "  pe                              u'23.39'\n",
      "  outstanding                     u'0.3'\n",
      "  totals                          u'1.2'\n",
      "  totalassets                     u'96895.55'\n",
      "  liquidassets                    u'58693.55'\n",
      "  fixedassets                     u'17550.51'\n",
      "  reserved                        u'8883.05'\n",
      "  reservedpershare                u'0.74'\n",
      "  eps                             u'1.152'\n",
      "  bvps                            u'7.43'\n",
      "  pb                              u'4.84'\n",
      "  timetomarket                    u'20170112'\n",
      "  undp                            u'45427.34'\n",
      "  perundp                         u'3.79'\n",
      "  rev                             u'1.09'\n",
      "  profit                          u'13.13'\n",
      "  gpr                             u'35.79'\n",
      "  npr                             u'16.0'\n",
      "  holders_num                     u'32221'\n",
      "  ds                              None\n",
      "}\n",
      "odps.Record {\n",
      "  code                            u'603628'\n",
      "  name                            u'N\\u6e05\\u6e90'\n",
      "  industry                        u'\\u7535\\u6c14\\u8bbe\\u5907'\n",
      "  area                            u'\\u798f\\u5efa'\n",
      "  pe                              u'33.79'\n",
      "  outstanding                     u'0.68'\n",
      "  totals                          u'2.74'\n",
      "  totalassets                     u'117525.67'\n",
      "  liquidassets                    u'97856.28'\n",
      "  fixedassets                     u'13912.67'\n",
      "  reserved                        u'11301.3'\n",
      "  reservedpershare                u'0.41'\n",
      "  eps                             u'0.178'\n",
      "  bvps                            u'2.6'\n",
      "  pb                              u'3.08'\n",
      "  timetomarket                    u'20170112'\n",
      "  undp                            u'20116.61'\n",
      "  perundp                         u'0.73'\n",
      "  rev                             u'0.0'\n",
      "  profit                          u'0.0'\n",
      "  gpr                             u'27.51'\n",
      "  npr                             u'9.38'\n",
      "  holders_num                     u'68468'\n",
      "  ds                              None\n",
      "}\n",
      "odps.Record {\n",
      "  code                            u'002824'\n",
      "  name                            u'N\\u548c\\u80dc'\n",
      "  industry                        u'\\u94dd'\n",
      "  area                            u'\\u5e7f\\u4e1c'\n",
      "  pe                              u'23.72'\n",
      "  outstanding                     u'0.3'\n",
      "  totals                          u'1.2'\n",
      "  totalassets                     u'60617.38'\n",
      "  liquidassets                    u'32629.4'\n",
      "  fixedassets                     u'20764.1'\n",
      "  reserved                        u'15127.51'\n",
      "  reservedpershare                u'1.26'\n",
      "  eps                             u'0.446'\n",
      "  bvps                            u'4.49'\n",
      "  pb                              u'3.14'\n",
      "  timetomarket                    u'20170112'\n",
      "  undp                            u'14318.35'\n",
      "  perundp                         u'1.19'\n",
      "  rev                             u'8.59'\n",
      "  profit                          u'56.76'\n",
      "  gpr                             u'24.79'\n",
      "  npr                             u'9.69'\n",
      "  holders_num                     u'56644'\n",
      "  ds                              None\n",
      "}\n"
     ]
    }
   ],
   "source": [
    "with t.open_reader(partition='ds=20170113') as reader:\n",
    "    count = reader.count\n",
    "    for record in reader[:5]:\n",
    "        print record"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 2",
   "language": "python",
   "name": "python2"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 2
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython2",
   "version": "2.7.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
